Introduction
Materials and Methods
Results
Discussion
Conclusion
Data set of rural people from Bangladesh with or without type 1 (T1) diabetes
Contains 306 data points and 22 variables
Exploratory Data Analysis
Classify children with T1 diabetes and identify the variables most important for T1 diabetes
Obtain data set
Data Wrangling
EDA
Analysis and Modeling
RF model
Shiny App
Working collaboratively using RStudio Cloud and GitHub
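The "RF model" step in the workflow above can be sketched roughly as follows. This is a minimal illustration only: it assumes the `randomForest` package (the slides do not say which implementation was used) and uses the built-in `iris` data as a stand-in, not the diabetes data set.

```r
library(randomForest)

# Stand-in data: the built-in iris set, NOT the diabetes data from these slides
set.seed(1)
rf_fit <- randomForest(Species ~ ., data = iris, ntree = 200, importance = TRUE)

# Out-of-bag confusion matrix: how well the classes are told apart
print(rf_fit$confusion)

# Variables ranked by mean decrease in accuracy (type = 1)
var_imp <- importance(rf_fit, type = 1)
print(var_imp[order(var_imp[, 1], decreasing = TRUE), , drop = FALSE])
```

With the real data, the response would be the T1 diabetes status column and the predictors the remaining variables; the ranking printed at the end is one way to explore which variables matter most.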
# Load libraries ----------------------------------------------------------
library("tidyverse")
# Load data ---------------------------------------------------------------
my_data_clean <- read_tsv(file = "/cloud/project/data/02_my_data_clean.tsv")
# Wrangle data ------------------------------------------------------------
my_data_clean_aug <- my_data_clean %>%
  mutate(Dur_disease = str_extract(`Duration of disease`, "\\d+\\.?\\d*"),
         unit = str_replace(`Duration of disease`, Dur_disease, "")) %>%
  select(-`Duration of disease`)
# Converting duration to days for every value
my_data_clean_aug <- my_data_clean_aug %>%
  mutate(Dur_disease = as.numeric(Dur_disease)) %>%
  mutate(Dur_disease = case_when(unit == "d" ~ Dur_disease,
                                 unit == "w" ~ Dur_disease * 7,
                                 unit == "m" ~ Dur_disease * 30,
                                 unit == "y" ~ Dur_disease * 365),
         Dur_disease = replace_na(Dur_disease, 0)) %>%
  # We do not need the unit column anymore
  select(-unit) %>%
  # Separating "Other diease" column into three
  separate(`Other diease`,
           into = c("first_disease",
                    "second_disease",
                    "third_disease"),
           sep = ",")
## Warning: Expected 3 pieces. Missing pieces filled with `NA` in 305 rows [1, 2,
## 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].
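The wrangling steps above can be tried on a small toy table. The sketch below is illustrative only: the rows are invented, and it assumes values look like "3d" or "1.5y", which may not match the real file. It also extracts the unit with a letter pattern instead of the `str_replace()` call used above, and passes `fill = "right"` to `separate()`, which pads missing pieces with `NA` without raising the warning.

```r
library(tidyverse)

# Hypothetical toy rows; the real column values may be formatted differently
toy <- tibble(
  `Duration of disease` = c("3d", "2w", "1.5y", NA),
  `Other diease` = c("no", "asthma,anemia", "no", "no")
)

toy_aug <- toy %>%
  mutate(
    # Numeric part of the duration, e.g. "1.5y" -> 1.5
    Dur_disease = as.numeric(str_extract(`Duration of disease`, "\\d+\\.?\\d*")),
    # Unit letter, e.g. "1.5y" -> "y" (assumes lower-case unit codes)
    unit = str_extract(`Duration of disease`, "[a-z]+"),
    # Convert every duration to days
    Dur_disease = case_when(
      unit == "d" ~ Dur_disease,
      unit == "w" ~ Dur_disease * 7,
      unit == "m" ~ Dur_disease * 30,
      unit == "y" ~ Dur_disease * 365
    ),
    # Missing or unrecognised durations become 0, as in the code above
    Dur_disease = replace_na(Dur_disease, 0)
  ) %>%
  select(-`Duration of disease`, -unit) %>%
  # fill = "right" pads short rows with NA on the right, silently
  separate(`Other diease`,
           into = c("first_disease", "second_disease", "third_disease"),
           sep = ",",
           fill = "right")

print(toy_aug)
```

Whether silencing the warning is appropriate depends on the data: here nearly every row has fewer than three diseases, so the padding is expected rather than a sign of malformed input.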
Results: Data Visualisation
Fig A
A caption
The data are well separated, so classification appears feasible.